August 2022
R (mostly) and Python (a bit less).
\(\Longrightarrow\) ENTERS PROGRAMMING!
Python is very versatile, and the community is strong.
It’s easy to find answers when you’re having troubles.
It’s free!
Alternatives (for finance):
Me and my python pet Java
You need to install Python, and an interface to communicate with your computer in Python.
Programming usually involves:
Doing so is efficient because it allows you to see the complete structure of what you’re asking the computer to do.
There are many editors and compilers, but we’ll use a simple setup.
Basic Anaconda Navigator
Updating Spyder
Spyder Interface
Python as a powerful calculator and much more from now on.
Python is open source and there is a huge community.
Important coding principles:
Important libraries:
numpy (matrix computation), matplotlib (charts), pandas (basic stats), SciPy (complex mathematics).
Python:
my_variable = 42
my_variable is a number (integer).
new_var = my_variable - 5
print(new_var)
## 37
float: numbers with decimals
my_float = 100.0
print(my_float)
## 100.0
string: a chain of characters
my_string = "I'll be back"
print(my_string)
## I'll be back
print(my_string[0:7])
## I'll be
list: can mix numbers and strings
my_list = [42, "I'll be back", 100.0]
print(my_list)
## [42, "I'll be back", 100.0]
my_list_bis = [my_variable, my_string, my_float]
print(my_list_bis)
## [42, "I'll be back", 100.0]
my_list_bis == my_list
## True
dictionaries: a list of named elements (rather than numbered)
my_dictionary = {}
my_dictionary['forty two'] = my_variable
my_dictionary['terminator'] = my_string
print(my_dictionary)
## {'forty two': 42, 'terminator': "I'll be back"}
print(my_dictionary['terminator'])
## I'll be back
# print(my_dictionary[0]) => Does not work
print(my_list[0])
## 42
Caution!
print(my_list)
## [42, "I'll be back", 100.0]
print(my_list[1])
## I'll be back
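The indexing rules used above can be summarized in a short sketch. It reuses the list and dictionary values from the slides; the variable names `first`, `last`, `first_two` and `lookup_failed` are only illustrative.

```python
my_list = [42, "I'll be back", 100.0]
my_dictionary = {"forty two": 42, "terminator": "I'll be back"}

first = my_list[0]        # indexing starts at 0, not 1
last = my_list[-1]        # negative indices count from the end
first_two = my_list[0:2]  # the end of a slice is excluded

# A dictionary is accessed by key, not by position.
try:
    my_dictionary[0]
    lookup_failed = False
except KeyError:
    lookup_failed = True

print(first, last, first_two, lookup_failed)
```

The `try`/`except` part is just a way to show that `my_dictionary[0]` raises a `KeyError` without stopping the script.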
Python with the numpy library.
numpy:
import numpy as np
my_vector = np.array([1.5,2.3])
print(my_vector)
## [1.5 2.3]
my_matrix = np.array([[1,2],[3,4]])
print(my_matrix)
## [[1 2]
##  [3 4]]
print(my_matrix.shape)
## (2, 2)
my_multiplication = np.matmul(my_matrix, my_vector)
print(my_multiplication)
## [ 6.1 13.7]
.T instead):
print(my_matrix.transpose())
## [[1 3]
##  [2 4]]
mat_zeros = np.zeros((2,10))
print([mat_zeros, mat_zeros.shape])
## [array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
##        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]]), (2, 10)]
mat_ones = np.ones((1,5))
print([mat_ones, mat_ones.shape])
## [array([[1., 1., 1., 1., 1.]]), (1, 5)]
mat_diag = np.eye(2) ; print(mat_diag)
## [[1. 0.]
##  [0. 1.]]
mat_diag + 1; mat_diag - 1; mat_diag * 2; mat_diag/2
## array([[2., 1.],
##        [1., 2.]])
## array([[ 0., -1.],
##        [-1.,  0.]])
## array([[2., 0.],
##        [0., 2.]])
## array([[0.5, 0. ],
##        [0. , 0.5]])
mat1 = np.ones((2,2)); mat2 = np.array([[1,2],[3,4]])
mat1 + mat2; mat1 - mat2; mat1 * mat2; mat1/mat2
## array([[2., 3.],
##        [4., 5.]])
## array([[ 0., -1.],
##        [-2., -3.]])
## array([[1., 2.],
##        [3., 4.]])
## array([[1.        , 0.5       ],
##        [0.33333333, 0.25      ]])
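Note that `*` above multiplies entry by entry; it is not the matrix product. A minimal sketch of the difference, reusing `mat1` and `mat2` from the slide (the names `elementwise` and `matrix_prod` are only illustrative):

```python
import numpy as np

mat1 = np.ones((2, 2))
mat2 = np.array([[1, 2], [3, 4]])

elementwise = mat1 * mat2   # multiplies entry by entry
matrix_prod = mat1 @ mat2   # matrix product, same as np.matmul(mat1, mat2)

print(elementwise)
print(matrix_prod)
```

Since `mat1` is full of ones, the element-wise product returns `mat2` itself, while the matrix product returns the column sums of `mat2` on each row.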
numpy
.array.function.
np.zeros((nrow, ncol)), np.ones((nrow, ncol)), np.eye(nrow), np.diag([number1, ..., number_n]).
Caution!
Remain careful about dimensions!
mat_diag + mat_ones
## Error in py_call_impl(callable, dots$args, dots$keywords): ValueError: operands could not be broadcast together with shapes (2,2) (1,5)
##
## Detailed traceback:
##   File "<string>", line 1, in <module>
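To see when an addition is accepted, here is a small sketch under the assumption that we simply try the operation and catch the error. The helper `can_add` is hypothetical (it is not a numpy function).

```python
import numpy as np

mat_diag = np.eye(2)        # shape (2, 2)
mat_ones = np.ones((1, 5))  # shape (1, 5)

def can_add(a, b):
    """Return True if a + b works, False if the shapes are incompatible."""
    try:
        a + b
        return True
    except ValueError:
        return False

print(can_add(mat_diag, mat_ones))          # shapes (2, 2) and (1, 5)
print(can_add(mat_diag, np.ones((1, 2))))   # shapes (2, 2) and (1, 2)

# A (1, 2) row is repeated over both rows of the (2, 2) matrix.
row_added = mat_diag + np.ones((1, 2))
print(row_added)
```

The second case works because numpy stretches a dimension of size 1 to match the other array; that is also why checking `.shape` before an operation is a good habit.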
2 ** 3 ; pow(2,3)
2 ** 3/4
np.ones([2,2]) @ np.array([[1,2],[3,4]])
[1,'1'] * [3, '4']
a = 4
b = 5
a -= 3 + b
['a', 'b', 'c', 'd'][2]
'Stonks only go up'[5:9]
['Stonks only go up'][5:9]
np.ones([1,5]) + np.diag([1,2])
linalg.inv of numpy.
1. ALWAYS COMMENT YOUR CODE
""" comment """.
#.
A typical script:
""" 0. Import libraries """
import numpy as np
""" 1. Building the needed matrices. """
my_matrix = np.array([[1,3],[2,4]]) # This matrix is (2x2)
my_matrix2 = np.ones((2,5)) # This matrix is (2x5)
""" 2. Performing our operations """
result = np.matmul(my_matrix, my_matrix2)
# Once our result is computed, we can perform other operations
2. CHOOSE EXPLICIT NAMES FOR VARIABLES
X9 is.
Example:
matrix_example_for_Python_course = np.array()
BE CAREFUL
Python is case-sensitive
toto, Toto and toTo are all different!
if and else.
for or while.
If statements
my_variable which was equal to…?
my_bool = my_variable > 50 ; print(my_bool)
## False
my_bool = my_variable <= 50 ; print(my_bool)
## True
my_bool = my_variable == 50 ; print(my_bool)
## False
my_bool is yet another type of object called boolean, which says True or False.
if statements will create booleans to make a decision.
If statements
my_variable is above 9,000.
if my_variable > 9000:
    print("It's over 9,000!")
elif my_variable == 9000:
    print("It's exactly 9000")
else:
    print("I have to stop watching Dragon Ball.")
## I have to stop watching Dragon Ball.
if [CONDITION]: and go to the next line.
elif is the combination of else and if to test a second condition.
else: gathers all remaining cases.
if, elif and else is called indentation
If statements
== (not to be mistaken with =, which assigns a value)
!=
<
<=
>
>=
if a < 10 and a >= 5:
if a < 10 or a >= 5:
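A minimal sketch of how `and` and `or` combine tests. The function `describe` and its messages are hypothetical and only serve as an illustration.

```python
def describe(a):
    """Classify a number using combined conditions."""
    if a < 10 and a >= 5:     # both conditions must hold
        return "between 5 (included) and 10 (excluded)"
    elif a < 0 or a >= 100:   # at least one condition must hold
        return "negative or at least 100"
    else:
        return "something else"

for value in [7, -3, 150, 42]:
    print(value, "->", describe(value))
```

Be careful with `or`: a test such as `a < 10 or a >= 5` holds for any ordinary number, since every number satisfies at least one of the two conditions.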
If statements
a = 42
if a > 10:
    print("above ten")
    if a > 20:
        print("and also above twenty!")
    else:
        print("but not above twenty.")
## above ten
## and also above twenty!
else only concerns the second test. If I defined a = 5, what would happen?
if you need another indentation (won’t work if not anyway)
If statements
for loops
for loops allow you to perform the same operation for each element of an object.
""" Initializing our result vector """
dummy_greater = np.zeros((100,1)) # Dimension is (100x1)
""" Performing for loop """
for i in range(0,100):
    if a >= (i+1):
        dummy_greater[i] = 1
What does this code do?
range(start, end) consists of all integers from start to end-1.
Don’t forget: first element of a vector is numbered 0.
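The loop above can be checked with a short sketch. It assumes `a = 42` as in the earlier slides; the vectorized version with `np.arange` is an alternative added here for comparison, not the method of the slides.

```python
import numpy as np

a = 42  # assumed value, as in the earlier slides

# Loop version: entry i is set to 1 when a >= i + 1.
dummy_greater = np.zeros((100, 1))
for i in range(0, 100):
    if a >= (i + 1):
        dummy_greater[i] = 1

# Vectorized version of the same test (an alternative, for comparison).
thresholds = np.arange(1, 101).reshape(100, 1)
dummy_vectorized = (a >= thresholds).astype(float)

print(list(range(0, 5)))          # the integers 0 to 4: the end is excluded
print(int(dummy_greater.sum()))   # number of entries set to 1
```

Because `range(0, 100)` starts at 0, the test uses `i + 1` so that position `i` corresponds to the number `i + 1`.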
for loops
Score_teams = {}
Score_teams["MTL"] = 4
Score_teams["VGK"] = 1
for names in Score_teams.keys():
    print(Score_teams[names])
## 4
## 1
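As a possible variant, one can loop over keys and values at the same time with `.items()`. The names `lines` and `winner` below are only illustrative.

```python
Score_teams = {"MTL": 4, "VGK": 1}

lines = []
for name, score in Score_teams.items():   # key and value together
    lines.append(f"{name}: {score}")

# Key with the largest value (the lookup function is Score_teams.get).
winner = max(Score_teams, key=Score_teams.get)

print(lines)
print(winner)
```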
for loops
# Constructing a vector of cashflows
Final_cashflow_vector = np.zeros((364 * 5))
# Complete the rest
while loops
while loops are a combination of for and if.
gains are below threshold, otherwise liquidate position.
for and if:
import random as rd
Gains = np.zeros(1)
count = 0
threshold = 10
while Gains[count] < threshold:
    heads_tails = rd.randint(0,1)
    if heads_tails == 1:
        Gains = np.append(Gains, Gains[count] + 1)
    else:
        Gains = np.append(Gains, Gains[count] - 1)
    count +=1 # same as count = count + 1
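A while loop stops only when its condition becomes false, and a fair coin game can take a very long time to reach the threshold. The sketch below wraps the same idea in a function with two additions that are assumptions of this example, not part of the slides: a `seed` for reproducibility and a `max_steps` cap.

```python
import random as rd

def simulate_gains(threshold=10, max_steps=100_000, seed=0):
    """Flip a coin until gains reach threshold or max_steps flips are done."""
    rng = rd.Random(seed)   # local generator so that runs are reproducible
    gains = [0]
    while gains[-1] < threshold and len(gains) - 1 < max_steps:
        step = 1 if rng.randint(0, 1) == 1 else -1
        gains.append(gains[-1] + step)
    return gains

path = simulate_gains()
print(len(path) - 1, path[-1])   # number of flips and final gain
```

Using a plain list with `append` avoids rebuilding a numpy array at each step; the result can still be passed to `plt.plot`.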
while loops
import matplotlib.pyplot as plt
plt.plot(Gains)
help
help(plt.plot)
## Help on function plot in module matplotlib.pyplot:
##
## plot(*args, scalex=True, scaley=True, data=None, **kwargs)
##     Plot y versus x as lines and/or markers.
##
##     Call signatures::
##
##         plot([x], y, [fmt], *, data=None, **kwargs)
##         plot([x], y, [fmt], [x2], y2, [fmt2], ..., **kwargs)
##
##     The coordinates of the points or line nodes are given by *x*, *y*.
##
##     The optional parameter *fmt* is a convenient way for defining basic
##     formatting like color, marker and linestyle. It's a shortcut string
##     notation described in the *Notes* section below.
##
##     >>> plot(x, y) # plot x and y using default line style and color
##     >>> plot(x, y, 'bo') # plot x and y using blue circle markers
##     >>> plot(y) # plot y using x as index array 0..N-1
##     >>> plot(y, 'r+') # ditto, but with red plusses
##
##     You can use `.Line2D` properties as keyword arguments for more
##     control on the appearance. Line properties and *fmt* can be mixed.
##     The following two calls yield identical results:
##
##     >>> plot(x, y, 'go--', linewidth=2, markersize=12)
##     >>> plot(x, y, color='green', marker='o', linestyle='dashed',
##     ... linewidth=2, markersize=12)
##
##     When conflicting with *fmt*, keyword arguments take precedence.
##
##
##     **Plotting labelled data**
##
##     There's a convenient way for plotting objects with labelled data (i.e.
##     data that can be accessed by index ``obj['y']``). Instead of giving
##     the data in *x* and *y*, you can provide the object in the *data*
##     parameter and just give the labels for *x* and *y*::
##
##     >>> plot('xlabel', 'ylabel', data=obj)
##
##     All indexable objects are supported. This could e.g. be a `dict`, a
##     `pandas.DataFame` or a structured numpy array.
##
##
##     **Plotting multiple sets of data**
##
##     There are various ways to plot multiple sets of data.
##
##     - The most straight forward way is just to call `plot` multiple times.
##       Example:
##
##       >>> plot(x1, y1, 'bo')
##       >>> plot(x2, y2, 'go')
##
##     - Alternatively, if your data is already a 2d array, you can pass it
##       directly to *x*, *y*. A separate data set will be drawn for every
##       column.
##
##       Example: an array ``a`` where the first column represents the *x*
##       values and the other columns are the *y* columns::
##
##       >>> plot(a[0], a[1:])
##
##     - The third way is to specify multiple sets of *[x]*, *y*, *[fmt]*
##       groups::
##
##       >>> plot(x1, y1, 'g^', x2, y2, 'g-')
##
##       In this case, any additional keyword argument applies to all
##       datasets. Also this syntax cannot be combined with the *data*
##       parameter.
##
##     By default, each line is assigned a different style specified by a
##     'style cycle'. The *fmt* and line property parameters are only
##     necessary if you want explicit deviations from these defaults.
##     Alternatively, you can also change the style cycle using
##     :rc:`axes.prop_cycle`.
##
##
##     Parameters
##     ----------
##     x, y : array-like or scalar
##         The horizontal / vertical coordinates of the data points.
##         *x* values are optional and default to `range(len(y))`.
##
##         Commonly, these parameters are 1D arrays.
##
##         They can also be scalars, or two-dimensional (in that case, the
##         columns represent separate data sets).
##
##         These arguments cannot be passed as keywords.
##
##     fmt : str, optional
##         A format string, e.g. 'ro' for red circles. See the *Notes*
##         section for a full description of the format strings.
##
##         Format strings are just an abbreviation for quickly setting
##         basic line properties. All of these and more can also be
##         controlled by keyword arguments.
##
##         This argument cannot be passed as keyword.
##
##     data : indexable object, optional
##         An object with labelled data. If given, provide the label names to
##         plot in *x* and *y*.
##
##         .. note::
##             Technically there's a slight ambiguity in calls where the
##             second label is a valid *fmt*. `plot('n', 'o', data=obj)`
##             could be `plt(x, y)` or `plt(y, fmt)`. In such cases,
##             the former interpretation is chosen, but a warning is issued.
##             You may suppress the warning by adding an empty format string
##             `plot('n', 'o', '', data=obj)`.
##
##     Other Parameters
##     ----------------
##     scalex, scaley : bool, optional, default: True
##         These parameters determined if the view limits are adapted to
##         the data limits. The values are passed on to `autoscale_view`.
##
##     **kwargs : `.Line2D` properties, optional
##         *kwargs* are used to specify properties like a line label (for
##         auto legends), linewidth, antialiasing, marker face color.
##         Example::
##
##         >>> plot([1, 2, 3], [1, 2, 3], 'go-', label='line 1', linewidth=2)
##         >>> plot([1, 2, 3], [1, 4, 9], 'rs', label='line 2')
##
##         If you make multiple lines with one plot command, the kwargs
##         apply to all those lines.
##
##         Here is a list of available `.Line2D` properties:
##
##         Properties:
##         agg_filter: a filter function, which takes a (m, n, 3) float array and a dpi value, and returns a (m, n, 3) array
##         alpha: float or None
##         animated: bool
##         antialiased or aa: bool
##         clip_box: `.Bbox`
##         clip_on: bool
##         clip_path: Patch or (Path, Transform) or None
##         color or c: color
##         contains: callable
##         dash_capstyle: {'butt', 'round', 'projecting'}
##         dash_joinstyle: {'miter', 'round', 'bevel'}
##         dashes: sequence of floats (on/off ink in points) or (None, None)
##         data: (2, N) array or two 1D arrays
##         drawstyle or ds: {'default', 'steps', 'steps-pre', 'steps-mid', 'steps-post'}, default: 'default'
##         figure: `.Figure`
##         fillstyle: {'full', 'left', 'right', 'bottom', 'top', 'none'}
##         gid: str
##         in_layout: bool
##         label: object
##         linestyle or ls: {'-', '--', '-.', ':', '', (offset, on-off-seq), ...}
##         linewidth or lw: float
##         marker: marker style
##         markeredgecolor or mec: color
##         markeredgewidth or mew: float
##         markerfacecolor or mfc: color
##         markerfacecoloralt or mfcalt: color
##         markersize or ms: float
##         markevery: None or int or (int, int) or slice or List[int] or float or (float, float)
##         path_effects: `.AbstractPathEffect`
##         picker: float or callable[[Artist, Event], Tuple[bool, dict]]
##         pickradius: float
##         rasterized: bool or None
##         sketch_params: (scale: float, length: float, randomness: float)
##         snap: bool or None
##         solid_capstyle: {'butt', 'round', 'projecting'}
##         solid_joinstyle: {'miter', 'round', 'bevel'}
##         transform: `matplotlib.transforms.Transform`
##         url: str
##         visible: bool
##         xdata: 1D array
##         ydata: 1D array
##         zorder: float
##
##     Returns
##     -------
##     lines
##         A list of `.Line2D` objects representing the plotted data.
##
##     See Also
##     --------
##     scatter : XY scatter plot with markers of varying size and/or color (
##         sometimes also called bubble chart).
##
##     Notes
##     -----
##     **Format Strings**
##
##     A format string consists of a part for color, marker and line::
##
##         fmt = '[marker][line][color]'
##
##     Each of them is optional. If not provided, the value from the style
##     cycle is used. Exception: If ``line`` is given, but no ``marker``,
##     the data will be a line without markers.
##
##     Other combinations such as ``[color][marker][line]`` are also
##     supported, but note that their parsing may be ambiguous.
##
##     **Markers**
##
##     ============= ===============================
##     character     description
##     ============= ===============================
##     ``'.'``       point marker
##     ``','``       pixel marker
##     ``'o'``       circle marker
##     ``'v'``       triangle_down marker
##     ``'^'``       triangle_up marker
##     ``'<'``       triangle_left marker
##     ``'>'``       triangle_right marker
##     ``'1'``       tri_down marker
##     ``'2'``       tri_up marker
##     ``'3'``       tri_left marker
##     ``'4'``       tri_right marker
##     ``'s'``       square marker
##     ``'p'``       pentagon marker
##     ``'*'``       star marker
##     ``'h'``       hexagon1 marker
##     ``'H'``       hexagon2 marker
##     ``'+'``       plus marker
##     ``'x'``       x marker
##     ``'D'``       diamond marker
##     ``'d'``       thin_diamond marker
##     ``'|'``       vline marker
##     ``'_'``       hline marker
##     ============= ===============================
##
##     **Line Styles**
##
##     ============= ===============================
##     character     description
##     ============= ===============================
##     ``'-'``       solid line style
##     ``'--'``      dashed line style
##     ``'-.'``      dash-dot line style
##     ``':'``       dotted line style
##     ============= ===============================
##
##     Example format strings::
##
##         'b' # blue markers with default shape
##         'or' # red circles
##         '-g' # green solid line
##         '--' # dashed line with default color
##         '^k:' # black triangle_up markers connected by a dotted line
##
##     **Colors**
##
##     The supported color abbreviations are the single letter codes
##
##     ============= ===============================
##     character     color
##     ============= ===============================
##     ``'b'``       blue
##     ``'g'``       green
##     ``'r'``       red
##     ``'c'``       cyan
##     ``'m'``       magenta
##     ``'y'``       yellow
##     ``'k'``       black
##     ``'w'``       white
##     ============= ===============================
##
##     and the ``'CN'`` colors that index into the default property cycle.
##
##     If the color is the only part of the format string, you can
##     additionally use any `matplotlib.colors` spec, e.g. full names
##     (``'green'``) or hex strings (``'#008000'``).
python functions.
blackscholesanalytics for instance)
functions that can perform any set of operations you want.
print(Option_data.columns)
## Index(['quote_date', 'underlying_symbol', 'root', 'expiry', 'strike', 'type',
##        'open_interest', 'total_volume', 'high', 'low', 'open', 'last',
##        'last_bid_price', 'last_ask_price', 'underlying_close', 'series_type',
##        'product_type'],
##       dtype='object')
print(Option_data.shape)
## (228, 17)
print(Option_data.head(5))
##    quote_date underlying_symbol  ... series_type product_type
## 0  1990-01-02              ^SPX  ...         NaN          NaN
## 1  1990-01-02              ^SPX  ...         NaN          NaN
## 2  1990-01-02              ^SPX  ...         NaN          NaN
## 3  1990-01-02              ^SPX  ...         NaN          NaN
## 4  1990-01-02              ^SPX  ...         NaN          NaN
##
## [5 rows x 17 columns]
reduced_Option_data = Option_data[["quote_date", "strike", "expiry","type", "last_ask_price", "underlying_close"]]
print(reduced_Option_data)
##      quote_date  strike      expiry type  last_ask_price  underlying_close
## 0    1990-01-02   275.0  1990-03-17    C           86.88            359.69
## 1    1990-01-02   275.0  1990-03-17    P            0.94            359.69
## 2    1990-01-02   300.0  1990-03-17    C           62.75            359.69
## 3    1990-01-02   300.0  1990-03-17    P            1.38            359.69
## 4    1990-01-02   325.0  1990-03-17    C           39.75            359.69
## ..          ...     ...         ...  ...             ...               ...
## 223  1990-01-02   350.0  1990-12-22    P           16.00            359.69
## 224  1990-01-02   375.0  1990-12-22    C           19.13            359.69
## 225  1990-01-02   375.0  1990-12-22    P           26.13            359.69
## 226  1990-01-02   400.0  1990-12-22    C           10.13            359.69
## 227  1990-01-02   400.0  1990-12-22    P           40.00            359.69
##
## [228 rows x 6 columns]
strike/underlying.
def Compute_moneyness(strike, underlying):
    return strike/underlying
Compute_moneyness(reduced_Option_data["strike"][0], reduced_Option_data["underlying_close"][0])
## 0.7645472490199895
Compute_moneyness function to any option.
for loop to compute the moneyness of all options.
def Compute_moneyness(strike, underlying, opt_type):
    Moneyness = strike/underlying
    if Moneyness == 1:
        result = "ATM"
    elif Moneyness < 1:
        if opt_type == "P":
            result = "OTM"
        else:
            result = "ITM"
    else:
        if opt_type == "P":
            result = "ITM"
        else:
            result = "OTM"
    return result
Compute_moneyness(reduced_Option_data["strike"][0], reduced_Option_data["underlying_close"][0], reduced_Option_data["type"][0])
## 'ITM'
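One possible way to apply the function to several options with a for loop is sketched below. To stay self-contained it does not load `Option_data`; instead it copies a few strikes and types from the table printed above into plain lists, and the lowercase name `compute_moneyness` is a compact rewrite of the slide's function (an assumption of this example).

```python
def compute_moneyness(strike, underlying, opt_type):
    """Label an option ITM/ATM/OTM from strike/underlying, as on the slide."""
    moneyness = strike / underlying
    if moneyness == 1:
        return "ATM"
    if moneyness < 1:
        return "OTM" if opt_type == "P" else "ITM"
    return "ITM" if opt_type == "P" else "OTM"

# A few rows copied from the printed table (rows 0, 1, 2, 3 and 227).
strikes = [275.0, 275.0, 300.0, 300.0, 400.0]
types = ["C", "P", "C", "P", "P"]
underlying_close = 359.69

labels = []
for strike, opt_type in zip(strikes, types):
    labels.append(compute_moneyness(strike, underlying_close, opt_type))

print(labels)
```

With the full data set, the same loop would run over the rows of `reduced_Option_data` instead of these short lists.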
def
(argument), separated by commas.
return.
np.matmul to compute the price of the bond.
coupon, FV, maturity and discount_rate
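A minimal sketch of such a function, under assumptions that are not stated in the slides: `coupon` is a cash amount paid at the end of each period, `maturity` is a whole number of periods, and `discount_rate` is a flat rate per period. The price is then the product of the cashflow vector and the vector of discount factors.

```python
import numpy as np

def bond_price(coupon, FV, maturity, discount_rate):
    """Price of a bond paying `coupon` each period and FV at maturity.

    Assumptions of this sketch: coupon is a cash amount per period,
    maturity is an integer number of periods, discount_rate is flat.
    """
    periods = np.arange(1, maturity + 1).astype(float)    # 1, 2, ..., maturity
    cashflows = np.full(maturity, float(coupon))          # one coupon per period
    cashflows[-1] += FV                                   # face value at maturity
    discount_factors = (1 + discount_rate) ** (-periods)  # 1 / (1 + r)^t
    return float(np.matmul(cashflows, discount_factors))  # sum of discounted cashflows

print(bond_price(coupon=5, FV=100, maturity=3, discount_rate=0.05))
```

When the coupon rate equals the discount rate, as in this call, the bond should be priced at (approximately) its face value, which is a quick sanity check.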
.R
x_int <- 12 # Integer
x_flo <- 1.5 # float
x_vec <- c(1,5,12) # vector of 3 elements
x_seq <- seq(0, 10, by = 0.5) # vector of all floats from 0 to 10 by step of 0.5
x_mat <- matrix(1:6, nrow = 2, ncol = 3) # matrix (2x3) filled by COLUMNS first
# Lists
x_lst <- list("a" = c(1:3), "b" = "toto"); print(x_lst)
## $a
## [1] 1 2 3
##
## $b
## [1] "toto"
# Data frames
x_dfm <- data.frame("first_var" = c(5:8), "second_var" = c("a","b","c","d")); print(x_dfm)
##   first_var second_var
## 1         5          a
## 2         6          b
## 3         7          c
## 4         8          d
# matrices
#---------
# Zeros:
x_zeros <- rep(0, 10); x_mat_zeros <- matrix(0, nrow = 2, ncol = 3)
# Ones: same principle replacing 0 by 1 in previous expressions
# Diagonal matrix
x_diag <- diag(1,10) # (10x10) identity matrix
# Multidimensional arrays
x_array <- array(0, dim = c(2,3,5)) # (2x3x5) array of numbers
# Transpose matrix
x_transpose <- t(x_mat) # (3x2) matrix
# Inverse matrix
x_inv <- solve(x_diag)
# Matrix multiplication
x_transpose %*% x_mat
##      [,1] [,2] [,3]
## [1,]    5   11   17
## [2,]   11   25   39
## [3,]   17   39   61
# If tests
x <- 42
if (x > 10) {
  print("x > 10")
}
## [1] "x > 10"
# Ifs can be called on the fly
my_vec <- c(-5:5)
print(my_vec[my_vec < 0])
## [1] -5 -4 -3 -2 -1
my_vec[my_vec < 0] <- NA
print(my_vec)
##  [1] NA NA NA NA NA  0  1  2  3  4  5
# For loops
y <- 0
for (i in 1:10){
  y = y + 1
}
print(y)
## [1] 10
# While loops
y <- 0
while (y < 10){
  y = y + 1
}
print(y)
## [1] 10
for loops: apply
my_data <- array(c(1:80), dim = c(4,2,10))
# for each indiv and charac, compute the mean (4x2 matrix).
ts_mean <- apply(my_data, c(1,2), mean)
# for each charac and time, compute the mean (2x10 matrix).
indiv_mean <- apply(my_data, c(2,3), mean)
# for each characteristic, compute the mean (2x1 vector).
charac_mean <- apply(my_data, 2, mean)
print(ts_mean) ; print(charac_mean)
##      [,1] [,2]
## [1,]   37   41
## [2,]   38   42
## [3,]   39   43
## [4,]   40   44
## [1] 38.5 42.5
x_vec <- c(NA, 1, 1, 1, rep(2,3))
mean(x_vec)
## [1] NA
# My function computes a mean and drops NAs
Mean_drop_NA <- function(vector){
  result <- mean(vector, na.rm = T)
  return(result)
}
Mean_drop_NA(x_vec)
## [1] 1.5
apply
apply(my_data, c(2,3), function(x){mean(x, na.rm = T)})
##      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,]  2.5 10.5 18.5 26.5 34.5 42.5 50.5 58.5 66.5  74.5
## [2,]  6.5 14.5 22.5 30.5 38.5 46.5 54.5 62.5 70.5  78.5
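For readers coming from the Python part, a rough numpy counterpart of the apply calls above can be sketched as follows. It assumes that R's column-first filling is reproduced with `order="F"`, and that the dimensions R keeps in `apply` correspond to the axes numpy does not average over.

```python
import numpy as np

# Same (4 x 2 x 10) array as in R; order="F" fills by columns first, like R.
my_data = np.arange(1, 81).reshape((4, 2, 10), order="F")

ts_mean = my_data.mean(axis=2)           # like apply(my_data, c(1,2), mean)
indiv_mean = my_data.mean(axis=0)        # like apply(my_data, c(2,3), mean)
charac_mean = my_data.mean(axis=(0, 2))  # like apply(my_data, 2, mean)

print(ts_mean)
print(charac_mean)
```

To ignore missing values as with `na.rm = T`, `np.nanmean` accepts the same `axis` argument, provided the array holds floats with `np.nan` entries.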
data.frame elements
print(x_dfm)
##   first_var second_var
## 1         5          a
## 2         6          b
## 3         7          c
## 4         8          d
x_dfm[,1] == x_dfm$first_var
## [1] TRUE TRUE TRUE TRUE
x_dfm["first_var"]
##   first_var
## 1         5
## 2         6
## 3         7
## 4         8